INTRODUCTION



The recent publico"* ^SSS^S^^T 
«Wscs of g«» and protein e™" l >°" ferfbwion 0 f spe- 
SS many ^ich have ^ 0 ^ t C c r P rotcias from the 
fic famihes of P^^^/Ses of related proteins from 

2c species) ^f^^ r ( !?2S^U« ^ 
fferent species). The power °* ^^Snties in gene or 

Seen two or more sequences. It am ^ d c 

of identity heween sequences infon nation about 

analysis o£ primary sequence pww»^ protcin(s) under 
St-oA or ^ e S S;°of a protein thatjeter- 
investigation, and it _ » the a ^ng. ^ 

mines its function. lnerL < !% ran , crio tome analysis (299) uuo 
^ding from •ftHSC* the protege and 

structural genomics (W) ana »™» j e (89) 159, 271). 

This present review .s design *» bined witn mforma- 
JTysis of protein «qu*»^£5 function to uncover 
Si on tertiary ^ ? SonX Srse proteins, the cup™, 
QunerfamiW of functionary " wteria and archaea to 
and to Sce^eir -^^E^ Specifically, 
Saryotcs including ^"^Ktopriaiiihea^ 
this^th leads from small en^s found ^ d 



Tic term rap* (f™» *? ^gnST. S"^ 1 s, ™ :, ™" 

aVso includes ganvoft h r <nornfi r. p » ( ■ g 5 nenSl onai struc 
(S. and h^CTS^5^^ tJta «cd the molecular 
mres of these proteins (155, }H) » unusual pro- 

Tease-resistant protein with ^^J^risrie of the cupin 
. 1 2.3.4) activity (173) The »»» ta which motif 1 corre- 
domain is a two-motif sequence (69) in 



protein phaseobn Q^^^E strands E and F, that vanes 

Lntaining) is » "»°^SSS °** c bacterial Tft 
in length from 15 residues inmanyo ^ ^ F 

more than 50 residues msone*™ ' *L£ost>c feature of 
Otme exact number of residues » feature is 

eich subclass of protein. The oth « ^ ^ prisc 

the overall orsanrzationj Mb P^' ^.yke pro- 
either a single domain, as n the genn i ^ CWX c. This 

feins (46, 200). or * /uptote^ «J tnc storage proteins and 
latter structure was identified ^unionary progres- 

SlTconsidercdtobepaxt of (20). It now 

stan from a siogle-domain, ? u f £ weill actually oc- 
sSifpossible that the ^^JSS^ution leading to 
e^^pwlWJ^ SS'S.b. For example two 
the two^omair, prcttg m cyanobactenum Syr, 
such duphcated protems, one bacterium Bacilli* 

?£cysl ™* %iCt£eYa n dC5ane(69),who 
were identified in l9 ^££nip«ition in an oxalate 
deWibed ^^^Tl liTS the wood-rotting 
decarbcayta* (OXDQ & inncd Ftammulina veluitpe*)- 
lUJcollybio ^^S^Dunwcll and Gane prupo^ 
On the basis of ^.^SeSJhfflt storage proteins the 
the hypothesis that all the nignw j> juch du pn..,. 

Component of the human d^U evoK 
^ted microbial sequences. U now w ^ thc stor . 

^ In this review, the mdividuai me aEnm0 ^ 

family^- des ?! bed " £55 -d function (where 
seouence, in addition to ,hur *™? on fe given to a flailed 
Sse are Vnown). .^"ST^ the prokaryote 
analysis of the cupm gene "\ t scquen ces described 
3*c most *^J^£giJ of the bioloeica 



ANALYTICAL METHODS USED TO IDENTIFY

».• n! >Ki<;is was the identifl- 
The original starting point for ^h.s nonapcpt idc se- 



0 



Voi- 2000 

fingcrpnnt ° a t ^f n 0 S , r ^alyses are now outdated 
"■te they used ^^^^^g^m^m2 

£W£2$Z£'+* ! - than onc - 

third of the available data. f this study was 

the con^eaw" conserved motif 2, G(X) S 

^SS^wSffiSSUES^ spacing of 15 to 
PXG(X) 2 H(X) 3 NJ wiui a two-motif signature 15 lo- 

cated within several conserv eo »4 ndudes gen nin and 
(49) (release 34.1) domains ^J^f^t&teA from 
U-n-like proteins ^g^^^SSl^ 

3ES2 SSt ^ be- 

cause of the varied "^J^S^f parted using the indi- 

vidual cupin motus ^tSeToSSdatabase searches 
spanning the two moufc. A sen* of rterati e 

was conducted usmg the gapped Wjjtf ™ ^ analvsis 
225) P ro 8ra mmes.TJepnn«p^^3cus ^ ^ ^ 

was the n"n rcd ^ da °^ G ^C,Wonnation, National In- 

Natiofl ai j^^jsff'SSSitSt ^ «» - 

stitutesofHealtn.Bethesaa. Mo-, > subtiL-ist (http: 

ducted on the &™ m * .^^^^aZ #nomc of Syn* 
//www.pasteu,^ 

0<^>.^ Qg^j^ genomes, either 
/cyano/scarcWhtol) i ^^t^ via GenBank and 
complete or unfinished, were , / ^. ncb i. n i m .nih 
othc? sites, ««W£^S) C fcBte» of Ge- 
.Swn^l^^*STM) (http://www.tigr.org/tigr 

United Kingdom) W*^™^ co din K regions and 

To identify V^^.^^^S^*^ ^ 
tneir protein products ^see below), pam cu rc _ 

paid to TBlastN searches. In many framc 
Staled significant mate** m ™^™™<* JpSTD 
(OTF) SSS \h of ?nsertions or dele- 
sequence. This suggoiw u -—senuence of c oning or 
tiorn in the DNA sequence ; as a ^ duCled on 

sequencing errors. Manual 'fW* Septidc sequences, 
S ucb sequences to genera* *^J*£* MgntnC nts of 

ing projects (^Sf^SL B. iubdlis (166), in order to 

Results of this analysis are reported below. 
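
The motif-based part of the search described above can be illustrated with a minimal sketch. It assumes motif 2 has the form G(X)5PXG(X)2H(X)3N as given in the text; the motif 1 pattern, the function name, and the spacing limits are assumptions introduced only for illustration, and intermotif spacing is taken here to mean the number of residues between the end of motif 1 and the start of motif 2. It is not the procedure used in the original analysis, which relied on iterative database searches.

```python
import re

# Motif 2 as written in the text: G(X)5PXG(X)2H(X)3N.
MOTIF2 = re.compile(r"G.{5}P.G.{2}H.{3}N")
# Motif 1 is an ASSUMPTION for this sketch (form G(X)5HXH(X)3,4E(X)6G).
MOTIF1 = re.compile(r"G.{5}H.H.{3,4}E.{6}G")


def find_cupin_domains(seq, min_gap=15, max_gap=60):
    """Return (motif1_start, motif2_start, gap) for each motif pair.

    gap is the number of residues between the end of motif 1 and the
    start of motif 2; min_gap and max_gap are hypothetical limits.
    """
    hits = []
    for m1 in MOTIF1.finditer(seq):
        # Look for the first motif 2 downstream of this motif 1.
        for m2 in MOTIF2.finditer(seq, m1.end()):
            gap = m2.start() - m1.end()
            if min_gap <= gap <= max_gap:
                hits.append((m1.start(), m2.start(), gap))
                break
    return hits
```

A sequence with two reported pairs would be a candidate two-domain protein; any such hit would still need confirmation by alignment, since a regular expression cannot tolerate the substitutions seen in real family members.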

MEMBERS OF THE CUPIN SUPERFAMILY
Tnefollowin^^^ 
cupin domain, complex

plicated structure wth twe •?J« d 2SdS between the wo 
rized according to the number of rest dues 

IS&m* Show the two conserved ™»*^»c of 1S io 
increase in intermotrf spacing from ^^^^rcscntaiivc 
roany .taoW -KJS^JS^ confirmation 

resolution of their tertiary *™f ' "J^hcsfa-w a P - 

BLAST (7). 



SINGLE-DOMAIN CUPINS

The great majority of cupin f^^£^& 
conserved domain at ttaweof tejgj • be J*. 

grouping, the <^ "JSSJrfft? Sable intermotif spac 
egorized not onty on^e b^ of *e var,a ^ ^ 

ing within this donuun but also oni ii 

conserved residues withm each motd o a I 

within the intermouf region^ the, g^Jgy mot if 

teto"^ 0 ^,?^ intermotif spacing 

has 16 residues (Fig. J). TM rmnuB" f 
found in cupins is 15 residues; th, L'jSS* >erc arc stcric 
together with the ^«^lS c SX7o? P ermit a shorter 

ing- 

Phosphomannose Isomerases

Pnosphomam.o.e ^^orKon^tSn^ 
zymes that catalyze ; the ^^Sm>«» Levant to 
^ateandfru^^^^ 

this review is that ot me type n j „ t u waVS including cap- 
be involved in a variety of ^ b ^^ m ^^ mC tabol.srn. 
sularpolysacctarideb^es^d^^ c 

Such enzymes which con am th ^ tein of 

separated by 15 aa, east e " hc ^^3 doina i n of a bifunc- 
abL 120 to » " « «^pjir»d GDP-man- 
tional enzyme (ca. 480 aa) witn q _ An w . 

nosepynjphosphoiylasetGMPHEC^ ; ^ 

ample of the latter type of £^.^ ona , cn ^ mc en- 
practical imoor^ce « ^ ^ *e first and 

coded by algA (179, 196, 25»), wn ^° p ' MI catalyzing the 

third steps in ^^^JjS^SSl U*-^ 
first step (152). This n^omgse 
auluronic acid and p-D-mannuromC acia ^ . is 

nomic importance, ^^SSS^ from bac 
usually extracted from marine ificanC e because of 

teria (237). Alginate also has raed«»isign y _ 
£ production by Pseudomuna; a ^°"T$ a 'conversion 
si on P of this bacterium |«a«JSS2"£ presence of 
is induced by several » nd '^S y , growth of the bac 
m ctaboUc inhibitors. *"J2ffirt in 
tcria in the lungs of cystic fibrosa \P a "^ h inabil5t y of antibi- 
su ch patients * «S5S?£5?id to the fact that the 
fungal cell walls, PMI inhibition is a target for drug discovery

the active site. In the context of the conserved histidines in the
two cupin motifs, it is pertinent to note recent evidence (220)
for the existence of a His residue in this site in a PMI from
Xanthomonas campestris; this particular PMI is considered to
be a metalloenzyme and is activated by zinc.

Polyketide Synthases (Putative Cyclases) 
The polyketide pathway (115, 125, 194, 235) accounts for the
biosynthesis of many of the thousands of
metabolites, including antibiotics and pigments. Among these
oroducts is cummycin (26). an antibiotic produced tySaqto- 
^T^cS based on a polykeode skeleton coasting of 
TSodS orselfinic acid-an unreduced version ' 
^Tadd and the simplest of all ««5£SJtf%J 
was found (25) that the gene cluster responsible for the syn 

oenTchiStS responsible for the synthesis of a grey spore pig- 
SttSSSof before sponuation in the aerial rnyce- 
«n and subsequent studies (36) demonstrated the wide- 

other Strepiomyces spp. Of specific interest t > thai rone- 
sequenced *e homolo^gr jpcjg- *g~*S$> 

heSt^'cSS of thie gene products ; « 
.ithn,.oh it is sueeested to be a cyclase (148, 318). Sequence 

See of 15 residues, within a total protem wm ofJJJ* 

relatives, such as the sequence »}^}™^2£&\v 
jaMb and the 140-aa Pepl ^^M^^Tbroad- 

recognized a "possible 
So^Xo^ 

the 77-aa "membrane-spanning protein" gil 1017816 from 
the //-aa " 1C ™ M ', ^ 199? However the start codon for 
Strcptomycts coebcolor (181, W^JJ* considered to 
this sequence has been Similar to a 

encode a 115-aa protein (gil 5457273) that is » 
7<Ea polypeptide encoded by nucleotides 243111 to 243347 
from contig 7 of Streptococcus pyogenes. 

Dioxygenases
(bicupins); within each subcategory the individual members
are recognized on the basis of a characteristic intermotif

3-Hydroxyanthranilate 3,4-dioxygenase (3-HAO) (EC 1.13.11.6)
is an enzyme that cleaves the aromatic ring of 3-hydroxyanthranilic
acid to produce 2-amino-3-carboxymuconic semialdehyde, an
intermediate in the synthesis of the excitotoxin quinolinic acid.
This compound kills neurons by activation of N-methyl-D-aspartate
receptors, and inhibition of 3-HAO is a
pharmaceutical target (35). The enzyme is well characterized
(210) and is part of the kynurenine pathway for
the catabolism of tryptophan. Recently, the yeast gene
YJR025c has been shown (164) to encode a 3-HAO
(gi|1353060) homologous to the human equivalent (WO) and
has been renamed BNA1 (biosynthesis of nicotinic acid). A
similar polypeptide (E value 5e-64) is encoded by part of
a contig (gnl|Stanford_5476|C.albicans_Con4-2428) from
Candida albicans. Alignment of these 3-HAO sequences reveals
a notable difference between the Saccharomyces sequence and
the other sequences, in that the former protein has an intermotif
spacing of 23 residues compared with 19 for the other
sequences; this insertion of 4 aa occurs in the loop between
the E and F strands of the barrel.

i. cSL. -a. -« tang— -^S^S 

Se^(296) ?uSaT(232), and QmahabdUI.^ V»* 
^£2$B characUU with *c dc **t b^ena 

(gl| 26355*0). itrepiamy enzyme 9 known, to 

bacterium tuberculosa (gi 1 2896702)^ i n **^. . (312) its 

be monomeric with one atom of uon P^S^* (247). 
acdvity is strongly reduced by chelators of Cu andi-c ^ > 



Spherulins

The life cvde of the simple slime mold Physarum polyceph- 
to be ccU-waU glycoprotems. It was discovered 4 



lnterestiaga<i dl ? on Ji^ * P r ° vlded ?« The discovery
n position W n ^ main of several seefl } shar cd 

ntrotvpo sl0 ? n ^ provided strong ^L^r 
? pofycephalu*) P^^nmon »« c f Signed to these 

To date, no ^ they do not seem heir possible 

^ruhns, although tncy Y valU w consiaer t thc 

w CHS). of wbal ' b 0 fS* the gen- 

.nditions ^^Skbcwce" it is inteiesungW 



7S 

Snplctc idcnW ! n o?t 0 » a^ Since the discovery ^ 

Siotif * adng £n?U27). there *f '^nat the latest 
gp in a W« identifier L such AflI ^, 

estimates gwe a touu . , gen orne to on ^ has yet been 
!L hest-characteruea k M wcve r, no tut"--" exception of 
Unpublished Ld£> ^^^th^egngtef^ 

^ cdt0 So^Sh do* ^havc .O^vn a'nalyscs^ 
Sous pto**^^?Tl8SSl) ^Scolvetopmen- 

Germin-U^P^varioxts stupes have 
.„, Bta2 es in ?l ants - \ n i ant development- develop- 



^1 



. i ike Protein KW ~ v , 

biochemical ana > ln e nnu & ^ oxu . i 

7 ai 162. 23». 310 >_!l!La wheat Retrain * a» . mM)0r t»nt 



2Tta the cotton aj£ ^ to » » 

^S^^KZm ■ fn* of mandarin 

S«S« of ripenmg ^bcd). and 
«« Fruit ripetuafi- ^ muiiv/cU, uny -.nucnceS- 

Watibbc* P' n ^^;l7745848)cxpr^in sifnaaf ly 



^atch trade one vv wot ^ founot tc 

.nultimene (3 \\ffh chcmlC al Jg^rtto have recetW 
sistance to "^ftbese unusual V^V ^t, and » 
hydrogen PO^^ rcalizatxon *f ar e members ot 
been e«P£»» cd h S«^od other eereals^Ob) ^ ^ 
^ tolWeS *SS»S thai *eir .^nce ^^ ^ty 

the cupm.t a .g v i 0 ^ a £otlC non of tnca 

fonment » l*eiy 



unsurprsmo ve cell w<»"» 

characterized by 
responses. Evidence' ^J^^Stal -»*» ° f nodu "
be tween Pj^^aito^Sgaaon, of specific 
lauon in legumes, as wc» 

pathogen responses "^^L te evidence for the occur- 
(i) Nodu ation in legumes. me nni ™ f fa 

rence of a GU> in a legume ^^^ n TV^ 
onanism of ; f JJJl, 1 ceils, a\ though 

this was not recognized as « ^ JL^ attachment 
tion (284). T*e in^p m ^^ e ^ nt pfiS) bac- 
proccss involves ^cadnear, « ^ 2g5) _ Usj ^ 

tcrial surface protein of a ^™Ste« »**y. a P uta " 
assay based on the suppression iflc0 from 

Replant receptor molecufcfor gJ^JJg aa of this 
cell walls of pea '^^^S-QDL^VADYAS 
protein were detenmned Itc be ^^/^ o£ this 

WVNGFASK(0)(P Q)U. ^J^™ sequence is 
. stu dy found »^t?ESc oSfw an^mbuto/** 
very similar (69% &^?^J!££L to the discussion 
SI (Bill^W- Of P"*Jrf£g Sa Preceptor 
elsewhere in this review » to ^Jj^ 1 w u wall with an • 
molecule was most ■ casUy "Jg ^ finding sug- 

aqueous solution of oxalate anch oring. func- 

gits that ^jffiCJSSd-U 

tion,orstabUiryandadcitotr««rcu^i* 

oxalate to the level o^ 1 ^^^ in that environment 
quentfimctionalconwlofotherp^temsmt GLp 

q In addition to this «^^^"3Spo*«oottl^ 
related to bacterial attachment to tnfi > wan oi ^ b o{ 
t £ known that oxalate itself * fo ^ftTCtotion of 
?0 «M in fnbabcan {Vic* aba )Tjoduh * $ baCteroid 

wa ,er stress id such ^"^jfoxtlie acid by 55% 
OXO fourfold and "J"^^^^ in this location 
(295). It is sugg^ d ^ ^Xtratt forbactcroidsand asa 

TSSSS 

against pathogen attack J*^ 1 "*^ antimicrobial com- 

tStt include the P^^^Sr^teins in the cell wall, 

pounds, the ^il^^CSSbohydrate polymers, 
^esynthesfeofcellwaj^gt^ca y^ ^ cn 

and hypersen^trve «>l dea^^ttou^^ d 
response was among the earnest m established until 

gSSn (] 70, 174), such a «**5Jg £3£ with other 
the identification of ^ n " ad i, fi/umerifl (sy* 

studies on the mtcractjors rf ^ 3 03, 322) and 

Erirypte) wth lc aV " Sown hat a specific- 

wS (129). SubsequenUy.^ been ^ 

pathogen-response OXO t ^°^. ation with mildew; the 
barley mesophyt cells 6 h after , Additionally, a 

enzyme accumulates after ^ ^ which ^ 

related sequence has been isouue * This paroc- 

papffla-mediated resistance to fhd spc - 
5£ transcript peaks at about ^ 
dfi callyinthe ep,dermalee^^^ fallows the 

ral and spatial P atte ^ J.-i^^ned on the inner surface 
formation of papulae, apposiUons d o{ proteins, 

of th e epidermal wail -g^ , j2 l S2eoi-Jlg corn- 
polyphenols, ^ £ iniscent of the complex 

5E2. S-J^ESKi- to above. It has been su g - 



ge5t ed that ^j^^SSS 

^rrtSe i «l^in nSboring epidermal or » 



levant to note the tenaa^ . 
„ * *w there arc common linK; 



^abinoxylans or ^^£ t ^^^nS« 
There is increasing evidence that there : m . 

be tween the transduction tg^J**^ % gcv 

response to bioeic ^ ^'f^^ZSl^oT^. 
species are irrvolvedm the p ant^ronmen io „ Qj 

308). In particular, the role of 'J*&£™ f n this contcxl 

^J^J^^S^lSS" role of th, 

it may also be relevant w cut r ^ crystals of cal- 

(B. Fristensky, ^^^Siion leaves ot 

T,,. first evidence ror mdu^on ^^ley J* (126. 

iornmon ice ^\ M ^^^^l^ ol Crass,!: 
tive halophyte and a ™*«W£Z ^attnent with 

lac ean acid metabolism ^6^"^^ 0x8 i ate content of 
'high levels Of salt. It was found (^ ^hat the^oxa ^ ^ 

thfleaf bladder ^s -creased ^from<l «M» ^ 
le vels were increased from! "5mM ^ ^ 

related to the modulauon iof ! the more - 

- « h J , S5»«i Adored m det, 
^ongthemostinter^ 

abiotic stress . ^spA ^ Jgy^ of ^ (Fpputo 
protein highly exprcssco m cmvu otein B >ho 

<*P° sed . t0 ^orSonTnd by osrnotic and cold 

stresses. In a recent smdy oi gr inPopanu <oi7Kr«- 

lowcr level of expression of BspA was^ rf waler 

jB.toinftp-tert*^- fJ^A^r^ tributes to mem- 
stress. It has been suggested tha riBuficmce i« rela- 
brane stability, a ^ 0 ^S^Swhich reccnt.y 
tiD „ to » STmdudc manganese den- 
have been shown to " .... ^2). aluminum 
ciency in tomato roots (P ,2y7 ?J5'n0ffi heat treatment in 
lrca nnent in wheat ^J^^ 0 rte (gi 1 2952338, gil 
barley (298), »d ^1^1^890^ gil 5827572 
3201969; see also tomato ta r A .^'^ of ^cse studies is thai 

utilizing a P*^;?^^ in transgenic 

showing ^ n ^^ C Z^^^ Md ^ 
Auxin-Binding Proteins



""f 300) and 

a reduction in chiasmi : pH n «™^ c .U 

?em , h« he oeB ?de ^pons&le for binding the carbox- 
, mcludes the ? e P* f r "£ ti aCid . n,* motif, known as 
? 2 SoSmSS bought to be equivalent to the 

l( *S£2 1 ^th ^ eupTn noSon (69), a 

.916807) and ABP 20 (gil 1V1 "»"^ p cent analysis 
ty to bind auxin, albeit at low -to the 

5* sequences shows a greater ^*gg«g GLP3 
• s (the closest ne ighbor P ; value * i ^ 
1755164] from /I. Afl*B««) 10 any 01 
er characterized ABPs. 

Epimerases

terial an d archaealee J^^S^^V 
:s, such as dTDP-4-ai.ny ™V, . , EC 5 1.3.13), which 
mi as dTDP-L-rhamnose ^^K^ dT DP-4W- 

.vcrts in 

WL-mannose. These enzyme* u oarale d by 

Contain the *^ffiJ%S£^*3* 
listance of 28 rescues; both ^° tl ^ 0 ™ ded £ % C (or 
nscrved histidine residue. They ^ cn "T l89 y l93( 2 Q5. 
uivalent), part of ^ * ^^^d^ lw** 
6). Most tfb operons start «^^ C L ' this cniS - 

derant to the theme of th* review 10 ^. reSistant 
rotcins are, as a rule, folded ™» ""J* 0 ^ 
anfonnarions, consistent with the digestive 

l0 mic%ortance as aqueous ^gJ^SJ,,^ in- 
liverse industrial and food application. » the 

•lude aanthan gum (^ om0 ^ ^f^Joduoed by spe- 
Iphingans (e.g., S'Uan, wdan, "JJ^J^aK- 

various sphingans be *" u ^f ^^297) of many invasive 
to the protective capsule!. (224, 24* • / 
pathogenic bacteria (e.g. alginate). 

MULTIDOMAIN PROTEINS WITH A SINGLE
CUPIN DOMAIN

ln the multidomain proteins ^ • ^3£S£S£ 
conserved cupin element does not he at *c cor ^ 

b „t instead ."l^-^^fiSK^ of proteins in this 

factors. 



MICROBIOL. MOL. BIOL. REV.



AraC-Type Transcription Factors 

Of al. the bacterial ^^^^ffiSi^X 
best-cha^cxized^ 

lator of the arabindse' P*^ a W; .""^ dasscs on 

100 members, which ican be '^riated primarily 
a functional basis. ™* c S pathogenesis, 

with carbon metabolism, stress W"* 4 ™* ^lol the 
with the former category -^JJgiffcelWObR). 

250 to 300 residues » to ngtt ^.^^^onconserved 
minal of about 100 »^ fc ^ th 7^ noteile (44). 
N-terminal domain availablcon the DNA bind- 

There is much more ato»2J 0 n N . tctminal 
ing component, although the specific J la * 0 t tQ 

section 5»^^^SSi^ " d «« i,cd 
te ^STS^SJ£i (250, 255) analysis over 
structural (269, 27U) ana m > . ^ tion comprises an 

arabinoUmd^ J barrc l- 

the DNA-b nding ™ ,» ^ d £eri»tion of the 

^."^^TSSSTlJsD shape and there- 
molecule, a factor wrucn oc «="? DNA strand. Oose anal- 
fore its ability to b«J«to-wa^IWA^~ shcd) 

rsirf^S^ esters 

named on the ban of its pren^d mvo sh|)wn 
xation of cellobiose, ^f r ^ f£ ^bolism of the 
ttatthercal.&BC»W«»»» 1 g^ ffl ^ ^ ha s been re- 
oisaccharide ^^a^SeScbitobiose) opeton. 
named cAWt P?rt of he ^ CJ^W iw s 
The significance of ito «« WjgLJgJ V nzyine s concerned 
a functional link both jjo and to l . hc 
with sugar metabolism (e.g., ™ .^.binding proteins 
higher-plant cupir-s, pa rticu larly * c r ^f^ addition al circum- 
(detailed below). In thus context. «w» in tha , vicilins 

tantial link between chitob.ose and cg»* » ^ 
from cowpea ^'^ ^ Sin-induced inhi- 

(248 ), and b^J-«3^^^ of the protein to 



TWO-DOMAIN BICUPINS 

The first two-domain proteins recognized as members of
the cupin superfamily were the seed storage proteins;
these are discussed below. Following the
structural analysis of cupins, related microbial
proteins from archaea, bacteria, and other organisms have been shown to
have a two-domain structure, and this information
has provided a new perspective on the
origin of the seed proteins. To date, the various subclasses
of two-domain cupins can be categorized in terms
of their intermotif spacing and whether this spacing
is the same (homo-bicupins) or different (hetero-bicupins) in
the two domains.
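
The categorization just described can be sketched as a small helper. The function name and the return format are hypothetical and chosen only for illustration; the "N+C" label follows the notation (e.g., 15+15, 20+20) used later in this review.

```python
def classify_bicupin(spacing_n, spacing_c):
    """Label a two-domain cupin by the intermotif spacing of each domain.

    spacing_n and spacing_c are the intermotif spacings (in residues) of
    the N-terminal and C-terminal cupin domains, respectively.
    """
    label = f"{spacing_n}+{spacing_c}"
    # Equal spacing in both domains defines a homo-bicupin.
    kind = "homo-bicupin" if spacing_n == spacing_c else "hetero-bicupin"
    return label, kind
```

For instance, a protein with 20 residues between the motifs in both domains would be labeled 20+20 and classed as a homo-bicupin.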
1-Hydroxy-2-Naphthoate Dioxygenase



Identification of ^°f^^^?^ t ^° X! '^' 
1 iSSwcn-e i GD °\^ y S the prepay 

S^K "d»^% t S ^wteriM) in TOW 
«Sos have P"" 8 ™ iL, (Bel»«« )»"»• 

n \rfi For example, a *„.ai n asa strain U2 too; - 

very sin** po^P^JSHf a «> nti S <^ lP i5Jr An 
Pacruginosa„Cootig5<lJ no 



ot n„ very **- <* W^J^ 

£SJ£» found in «j^53tSrj£ Sshly -Hne -I 

previous comment - MjJ-Jfc w * cl 
these mo types of •hW^^M tel3530667) to; 
aU (305), who sho^d that its P-JJJ? 

a towrimaarity to the HNDO ^ M|hwdB p«bng t**- 

cfl ^ 0 ^ S p.strmnKp7(134M ^-2'^boxybenzal- 

^uvatc. a tmg cleavage c ff C cted by GDO- 
SSJtoted carboy ""SK^Sed in this section have a 
Bo* classes of e ^ C Q tS « apparent subunit molec- 
mu nimcric structure; GDO has a PP^ ^ have c , th a 
ular mass of 38 i?, 39 ^^ hJjajcric (151) composer, 
tctrameric (85. 281. 305 ) a o{ 45 < nnd u 

^nereas HNDO has a jJ^S!) l*c most other d.oxyge- 
..iitaied to he hexamenc <}^>' ~r t clcave a n aromatu 
FIG. 3. Alignment of the 20+20 bicupin sequences (oxalate decarboxylases and related proteins). [Alignment and caption not recoverable.]



HND 0 contain 1 mol of Fe 2 ' pet mol of subunit (those 
globifonnis and BadU* brev» conta.n 
«*^rXou$ they utilize the same coordinating rcs- 

contain manganese (238, 239; S. Bornemann, personal communication).

Oxalate Decarboxylases 

Among the many oxalate-degrading enzymes isolated
from fungi, possibly the best characterized is that from the
wood-rotting fungus Collybia velutipes. This
homo-bicupin enzyme (intermotif spacing of 20 aa in each
domain) degrades oxalate to formate and carbon dioxide
and appears not to have any requirement for cofactors. It
was therefore selected for use in strategies to reduce the
levels of endogenous oxalate in plants (198, 199). The
enzyme itself has an acidic pI, is stable over a wide pH range,
is moderately thermostable, and has a molecular mass of 560
kDa as estimated by gel filtration and a subunit ma* » <* j£ 
vn=, hefere and 55 kDa after treatment with endo-p-JV 

n9c?> The sequence of the C. ve/wmes enzyme has been 
published as gi|1604990 (52), and recently the sequence of
a similar enzyme from Aspergillus phoenices has been described
(C. J. Scelonge and D. L. Bidney, 1 October 1998, PCT
patent application WO 98/42827). Presumed homologues of
these sequences have also been identified (Dunwell, unpublished;
see below) in the bacterial species B. subtilis and
Streptococcus mutans (encoded by nucleotides 555 to 1676
from contig 1009) (Fig. 3).

Sucrose-Binding Proteins 

Among the two-domain relatives * "jJ-Sgg P ^ 
teins is a sucrose-binding protein (SBP) 1 <P , f^?™ M ™ 
ril 2765097) found at low abundance m the plasma n^Jnaoc 
S cowledons, leaves, and mature phloem of legumes (103) a 
Sequence ^(2148163) from *c cgd Za*» fig* 
rra is known (40\ Recent comparison (219) ot tnc soyocan 
SBP sequence with that of vicilin has shown that the N-terminal
domain of SBP contains 12 of the 13 residues conserved
in the whole vicilin family, with the C-terminal domain
having 10 of the 12 conserved residues.

Avouch the overall tertiary structure of SBP can be pre 
dieted comparison to phaseolin it is 
vsis of die disaccharidc-binding domain of C=J>/ChbR (see 
"AraC-typc transcription factors" above) would provide fur 

th^infomation on the specific ligands in the bindmg s.tc 
TABLE 1. Tmr-T - "T*



v r*M&^*Wi^ _ 1 " ' --^^V^ 

: " ~ ~ 7" «|3256M3 from jyOT* tortoWL ***** PMI 

15*5808 (316) lO^HWb ^ abovc 

40OH17 l-^ 49 



Closest neighbor, pesw< 



ihlc function 



Archaea
DcsulfurocQCcus strain &t 
Thermococcus strain KS-8



Eubacteria
Aquifex aeolicus
Aeromonas caviae
Mycobacterium leprae
Alicyclobacillus acidocaldarius
Streptomyces lividans
Bacillus stearothermophilus
Desulfovibrio desulfuricans
Pseudomonas lemoignei
Morganella morganii

Corynebacterium glutamicum
Synechocystis

Azorhizobium caulinodans
Mycobacterium genavense

Eukaryota
Arabidopsis thaliana


2983162(53) 
42W207 
1377767 
39300 (157) 
48953 (234) 

56on2y (216) 

49285 (278) 
531465 
508518 (6ty 

2342561 (137) 
287460 (178) 
763059 (92) 
2558999 



1177288 (241) 
987518 



7591-7914 

232-23 (negative strand) 
1906CM9155 

2196-1755 (negative strand) 
171-1 (negative strand) 
338-1 (negative strand) 
238-726 

1618-1286 (negative strand) 
1666-1184 (negative strand) 

3686-3357 (negative strand) 
3-208 (add G at 104) 
1-201 

947-1519 (ATG »l 960) 

302-3 (negative 
1195-1873 (add G at 1288) 



15-164 pillTIGRlHTMBXW Thermotogo maritimu 

2632508 from B. subtilis AraC? 

« 12984230 from A. 

gt 1 1907078 human pirin (304) 

^^146533 contfg 230 from P- aeruginosa 
«£ lSS cSg from 5^--^" 
SSffi owtSE BS4 from Bprdetella pertussis _ 
55K5f from teifc* sugar alcohol 

git 1772621 from Ewnta chryxanThani 
AS above 

AmM-60077 contiR 229 from P- qctu&wm 



Seed Storage Proteins 



accumulation of nitrogen and boused 35 a s0 urce 

proteins that can withstand £^jg^**1d»S* 

type of ««^-PT^^r f SS« usually found as 
levies and the ^^^S^Tffident/nS), with 
hexamcne coynp ^pLex consisting of 

each subunrt denved from a precurso ^ ^ 

two domains, « N^™j£ Sated following proteolytic 
basic & cbam, which remain { ^ cac h 

proving. .T*e latter ^S^fi'S « subject - 
subunit being a aO- to 7 ?*^i^ , ^. t f on of F „, 1 shows that 
variable levels of processing, ^gf 11 * J&e conserved 

most of the storage P rote ^^ r J erv eS in motif 1- It is 
His residues or contam a angle «^eiveo ^_ bk , ding 

turned that, as a ""J^t^rto «, however, 

a massive »«^ at,0B ^l v Sent in soybean (131) and 
weightD during early seed d ^p™enn n soy 

prefumibryin other ^^^ffsubLte for a 
The possibility that ^.^"Sw rSvWcd by the storage 
residual ^^^X^cS^^ of 
proteins being FW*gdjJ pbasco»n (177) 

sa^-^s> ^ subscqucnt 

predictions of cupin strucoaresl rotcins found 
P m addition to the we Mm "J^SS ^oteins oi this 
ta seeds and spores (261). <ffi^££*. ^mong the best 
type have been the subject ofdctaUeo an j> 
S^ractenied is to » fSSfe protem response 
of tb e viata famdy ^(43 54, 258) an^ ^ In 

for the i^^KfC STown using moleculax mod- 
S?hS£ » 8 unea?Tnin«noglobnlin E-binding epitopes 



for transgenic approaches (66^ ^ l tc mw ' ^ 
residues. Like many °J " ^ b !^«L hi a very high level 
scribed in this review ; J^yKSSfaA ^ thods 
of stability; it m ^^tStinal tract or its » 
and also resets d g^»on^e ^^ d th 

vitro equivalent (23). It has neeo » ^5 thc 

stability may be due » «3^SSSSittW its passage 
possibiUty for protease *S e ^onano biophysical 

S^aSS 



Bicupins of Unknown Function

As described above. ^ c ^^S>0 £ 
variety of bicupms JgS » SwH» 3110 

Halofcnx [85]), many ^""S^ncodcd by conrig 272) 
tococcw pyogenes (the ' ^4). With the 

and several eukaryotes (e-g., and the OXDO 
option of the two cl «f "^Eto^, no biochem- 
icalfunc^n has yet been asagneo the aCtivitieS c f 

would be of P^'WJ* Xh now probably repre- 
th e fouraamplesfrornB.^««. pf okaiyotic cup«n 

scnts the best <>rgan«m ^ «ujy P for 
diversity. ^ *^g5iS rfbkuito tto^— 

CRYPTIC SEQUENCES ENCODING CUPIN PROTEINS

* addition to the «^^ ta J^Cl5ff5 

there is a group of ^jSiSyMW in the 
(Dunwell, o" P artial ORFs, often found 

bases. These are either complete a pam 

ssr 5e S'S^ -° rdinB to Ac rea - 
DUNWELL ET AL.



FIG. 4. Cupin sequences from B. subtilis (columns: sequence, gene, site, strand, motif 1, AAs, pI). [Figure content and caption not recoverable.]

,:.u^\ had SUfiRCS 



he algorithms used id ^ ORFs in * ^ {rom other 
The occurrence of such as that con- 

in the sequence. 



^vsis of con* ^ „ f 

/OthouBu * bw^JJJ^ overall occ^ncc* 
considerable *^J£X«l{ ■ across wj-gS Uy 

ra ,ely the spectrum ofcug- r^^S^ii' 
already toown W) tn ^ ^nobacttnum^ 

Sfi.bSS(«>» wl, " ,n>,yi,u *^ (P 



c— — -as* . nicthods 

A .aWs, of the ^.aSSft^^ 6 ^ 
ativclylongmtcnnouf dista 
TABLE 2. Analysis of the closest neighbors for each of the cupin sequences from B. subtilis



Species 



(SO 



Similarity L p«P* . , _ j> 



Functitm^ 



2633149 

2632815 

2633418 

2632508 

2634298 

2632900 

2633556 

2636105 

2636545 

2632720 

263274) 

2636309 

2633581 

2633733 

2635101 

263426U 

2635821 

2635598 

2636317 

2636534 



B. subtilis 
L monocytogenes 
B. mesaicrium 
£ culi 
P. kio&iatfii 
J}, subtilis 
B. subtdis 
R. subtilis 
A oeolicus 
M. moTjpnii 
S- mcliloti 
E. faecalis 
D. radiodurons 

A. acalicus 
P t persica 

B. subtilis 
B- subtilis 
C albican* 
A. actinamyc* 
P. aeruginosa 



2633014 

2745844 

2764541 

132526 

2495367 

2636105 

2636105 

2632900 

2984227 

5085*8* 

Unfw- C 

Unfin.' 

2984230 

1916809 

2635821 

2634260 

Unfin/ 



96 
262 
265 
248 
243 
311 
316 
311 

61 c 
155 

86 rf 
136 
175 
128 
379 
379 
101 
270 
236 



31 

24 

25 

23 

27 

55 

56 

57 

32 

41 

42 

26 

37 

33 

22 

58 

58 

32 

41 

23 



58 

47 

44 

41 

44 

67 

70 

69 

61 

55 

61 

44 

57 

57 

38 

75 

75 

50 

62 

40 



7 

12 
8 
10 

<l 



14 
7 
2 
2 
8 

14 



5 
4 
0 



5c-U 

5c-17 

9c-14 

fic-13 

5c-17 

e-102 

c-106 

c-107 

2e-04 

c-J8 

2*-12 

0.36 

5c-21 

3e-23 

0.11 

c-134 

c-134 

2c-07 

9e-66 

2c- 10 



ArnC 

AraC 

AraC 

AraC 

AraC 

PMI 

PMl 

PMi 



7PKS 



GLP 



?CDO 
dTDP-DR" 



- Estimated by of the gapped BtoJ PJgt artd a previously unidentified polypeptide (scr the \C*t for details). 

- ^2* r^oos spaTthe cacexved twp-rmnlf section of the 
dTDP-4-dehydrorhamnose reductase.



protein sequence, as estimated by a BlastP analysis, is given in
Table 2. In terms of function, it can be seen that the sequences
can be divided into various subgroups that include five AraC-type
transcription factors, three PMIs, and a cysteine dioxygenase.
However, an obvious problem inherent in this type of
comparison based on the total sequence is that it takes no
account of the occurrence of multidomain proteins. For example,
analysis of sections of the SpsK protein suggests that it
probably represents a bifunctional enzyme similar to one from
Actinobacillus actinomycetemcomitans, with an N-terminal domain
presumed to have dTDP-4-dehydrorhamnose reductase
activity (cf. gi|2650312 from Archaeoglobus fulgidus) and a
C-terminal domain (containing the cupin element) with
dTDP-4-dehydrorhamnose 3,5-epimerase activity (cf. gi|26^921
from Methanobacterium thermoautotrophicum).

The unusual protein YdaE is most closely related to a previously
unidentified protein from Morganella morganii.

Additional confirmation of the different functional subgroups
can be obtained by examination of the pI values given
in Fig. 4. This shows that all the transcription factors have
pI values between 6.10 and 8.48, whereas the other proteins (with
the exception of YjlB and YrkC) are more acidic, with values
between 4.41 and 5.90.

Domain Structure
There are 16 single-domain and 4 two-domain (bicupin)
proteins encoded by the B. subtilis genome (Fig. [...]); [...]
are referred to below on the basis of their intermotif spacing
(e.g., 15+15, 20+20). Of the former group of one-domain
sequences, particular note should be made of the two examples
that have a spacing of 20 residues, namely, YkrZ and YrkC.
The former is most similar to a recently described sequence
from the hyperthermophilic bacterium Aquifex aeolicus (53),
whereas the second sequence is closer to a sequence from
[...].

Probably the most interesting of the latter group of bicupins
are the two sequences YoaN and YvrK, which have a very high
level of similarity (E value 1e-130) to a sequence from
Streptococcus mutans ([...]09) and to the oxalate decarboxylases
encoded by gi|1604990 from Collybia velutipes, a wood-rotting
basidiomycete (198), and a related sequence from
Aspergillus phoenicis (Scelonge and Bidney, patent application).
These fungal enzymes are related to the Synechocystis
protein gi|1652630 (69), the only other 20+20 microbial bicupin
identified to date. Detailed inspection of the six-sequence
alignment provided in Fig. 3 reveals two main features. First,
there are 64 (ca. 16% of the total) globally conserved residues,
mostly clustered within the two cupin motifs, which have the
composition GX[...]HWHX[...]EWX[...]G and GX[...]HX[...].
Of these 64 residues, only 11 (ca. 3%), including the 3 histidines,
also show conservation between the first and second domains.
Second, the fungal OXDCs are more similar to the
sequences from B. subtilis and S. mutans than they are to the
[...].

[...] protein sequence (data not shown)
suggest that the most likely single-domain progenitor of the
two-domain 20+20 proteins is YkrZ and that this protein is
more similar to YvrK than to YoaN. The evolutionary
time course of events is thus indicated to be (YkrZ) x 2 ->
YvrK -> YoaN. Similarly, it is likely that YjlB (18 spacing) is
the progenitor of its closest neighbor, the two-domain [...]
(15+15) sequence (Table 2), although this would imply that
the increase in intermotif spacing from 15 to 18 residues in
YjlB occurred after the duplication event. It is also noticeable
from alignments of single cupins with their putative bicupin
derivatives that the single-domain sequences (e.g., YjlB) always
show a higher degree of similarity to the C-terminal
domain than to the N-terminal domain of the respective bicupin.

When alignments are based on the DNA rather than the protein
sequence, additional features can be observed (Fig. 5). For
example, the doublet of bicupin genes (yvrK and yoaN) are very

[FIG. 5. DNA sequence alignment of the yvrK and yoaN genes, with motif 1 labeled. The alignment body and caption are not recoverable. Identities = 75[...]/1,151 (65%).]




similar to each other (65% identity; E value 2.2e-72), although
each gene has a different pattern of insertions and deletions
(indels). However, these differences in nucleotide sequence do
not disrupt the conserved two-motif regions; where there are
indels within these motifs, they are equivalent in the two genes
and do not alter the globally conserved residues.
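
The percent identity and indel pattern quoted for the yvrK/yoaN comparison can be illustrated with a minimal sketch. The function names and the toy sequences below are hypothetical (they are not the gene sequences of Fig. 5), and counting gap columns in the denominator is an assumption; alignment programs differ on this convention.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over all alignment columns.

    Assumption: columns containing a gap count as mismatches and are
    included in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


def indel_columns(aligned_a, aligned_b, gap="-"):
    """Return 0-based columns where exactly one sequence has a gap."""
    return [
        i
        for i, (x, y) in enumerate(zip(aligned_a, aligned_b))
        if (x == gap) != (y == gap)
    ]


# Toy alignment, invented for illustration only.
a = "ATGAAGA--CAGAA"
b = "ATGCAGATTCA-AA"

print(round(percent_identity(a, b), 1))  # 71.4 (10 identical columns out of 14)
print(indel_columns(a, b))               # [7, 8, 11]
```

Listing the indel columns alongside the identity score makes it easy to check whether any indel falls inside a region of interest, such as a conserved motif.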

In an earlier study (69), it was suggested that the two-domain
OXDC proteins may represent direct progenitors of the two-[...]

Physical Location of Cupin Genes within the
Ph,S * 5^ a"—-"^ pNA strands, 
The cupin ^^1^^^^ 
8nd although they are in Skilobasc value as the 
ffi* 4), there is a possible acre* c in noticc abIe that 

Sple^ty of the P^^trK and^Sv) are on op- 
the two i-o^^i^ofSB two-domain sequences 

^o£e^^^ 
kb 2000). 



w are kveral -portant.cord^ J^tHlK 

previously unrecorded ^Sal ^tn-posi^ spec**; ^ 
Lai of 4,100) m ^^^STf^o types 
group of sequences proves cvtoen ^ Qf th g 

^plication having occurred dun«g» progenitor^. 
su bufc genome and/or the J?™ m ^ aeaS6 the number of 
Brk there has been dupb^ J 0 ^ haS 568 (14*) 

Spin g-es. U is **g^SS!£* ™ C») - £ 
of its 4,100 genes in the form t obviouS eumpfc 
form of triplets. In the encoding two-doma.n 

of a doublet isyoatf and yvrK, Similarly,/^ 

• proteins closely ^J^JS^jit^*** 
U iu two related "f^f^SSpitoa factors with idcn- 
Se genes eroding ^Sof an even larger gene 

if this class of «^«^JSS-tao or fusion to produce 

• : Second, tbsrettevidenc^dup^ ^ More- 

the genes encoding the «jSSdV.l ^ C v °^ 
over, this proc^musth.ve t^ite ^ ^ 

for ^u P= nsj»as ^$ orthis ^mpdor, 
and 2 X 20, ^*°^i bC ? h l ^ ctfam progenitor for each 

thScorflparisonvnthoth^n^ 

date, the A »Mto ^ oS« apd 

of both the overall jS^terest U thC ^ 

Sable in the P^^ffiveS identified in pro- 
20-aa-spa«dnBCupms«njenc«tot.a ^ „, 

karvotes. Together wifc the polypeptide en- 

5&4230 473^8 S 235 from 

£ded by mideondes 46B06 » 4 ^ « significance to .our 

origin of the increastngly 

understanding of ^ ^ «* d *»» 

^portanteukaryoUcGLP^wtucn^^ obs ervaUOns 

51» S-A->c multiple of th. 



of cupin se^ *^^J?£££ 
Sown that GLPs occur* kn»L jpe^ JJ, (6) . 

including ftoia. c«n*«« Whether this expansion in numbers, 
-if* pot ^own^!^, ^ m 

from the gymnospenns. 

^„™c ftF cTjPiN COMPOSITION 

R enes vanes from 2-7 in te ,™Zg|^ to be the most 
late and ui^ui/« & sabtiUs and Syn- 

Spiy branching and I animals (Dunwell, 
(65) to >« tab- there are 

unpublished). «*Sk SZln bffurther subdivided on 
about 20 Gl^.« hlc ^^Xc w s^en subclasses. Apart from 

protciivs. execotion of the cpimerases (e.g., 

V interestingly, with ^^^L^ cupms appear to be 
6i | 1666505 from Up^ospra ^^f^ £ e analysis of the 
absent from spirochetes, as ^ff^^ (genome size, 
nhMr small genomes of J^ ^^ ^ 
1 44 Mb) (79), T *P° ne ™f,^v C of the ancestor to mito- 
L„««fcB (the closest ***«*f?l0 Mb) (9). ^ d CW ^ d n 
P chondria, ^ » ffiSf ^ T erence, the g^e size of 

cupins U part of ft"*^ ^eSular biosynthetic htne- 
strletedtb fimgi. ycasts '^f £Sts ^ vvidesprcad occurrence, 

from A O^^'W fe' 27393 ^J'faSttan some of these animal 
2T(pl28ZZl26). lnl ^^rS^genases, and epime- 
sequences, such as P^J^T^Sis described above 
Si, axe related " ^^/Sromeric proteins (282) 
while others, such as * e ^^ tion factors, are speofic to 
and various zinc finger transcnp 

eukaryotes. 



k * it has been assumed lhatttie gene 
In the discussion ^oye, ith^ higher^rder mul- 

duplication events to f^^cterial genome • 
tiples all took place » fJ^Sal eenome of B. subulx* the 
s, a possibility JjJSSlSwl genomes (either .« 
result of one ° r x m0 ^° th * e Ablets of genes representthe 
whole or in part) and that the ao Large-scale analyses 

^sequence <* ^ ^ ted to the suggestion thut 

S protein sequences baveah .ady 1c ^ ^ 

[...] evolution of the bacterial lineage and
[...] 40, 242). In the same context, it has also been proposed
that the eukaryotic nuclear genome is a chimera that has
[...]
[...] likely lead to a complicated structure for many
prokaryotic genomes, with gene doublets resulting from either
[...] or internal duplication.

Physical Location of Cupin Genes in the Bacterial Genome

[...] important aspect of cupin evolution concerns the fact
that cupin genes are distributed on both DNA strands of the
[...] chromosome. This suggests that at some stage during
evolution there was a duplication of genes from one strand to
the other. The specific example of the 20+20 bicupins is
[...]; for example, it can be seen that in [...]
there is one such gene on the left strand, whereas in B. subtilis
the two closely similar genes are transcribed by different
[...].

Analysis of the location of the cupin sequences within the B.
subtilis genome (Fig. 4) shows a general progression, with the
genes encoding the shorter proteins being located closer to the
origin of replication than are the genes encoding the longer
proteins. Similarly, all the two-domain sequences are located
in the second half of the genome (from kb 2038 to 4106). There
is little information on any factor(s) that might determine the
higher-order location of genes in this species, although it has
been suggested (166) that the "grey hole" located at kb [...]600
[...] to the temporary chromosome [...]
[...] during the first stages of the [...]:
[...] of about one-third of the chromosome enters the
prespore and remains the sole part of the chromosome in the
prespore for a significant transition period (311). In light of the
link between cupin proteins and desiccation
([...], 20, 69), it is possible that there is a functional reason for
the clustering of genes required during this stage of the life
cycle.

There are two additional pieces of evidence for [...]
[...] between related cupin genes and between such genes
[...] functionally related genes. Most [...]
[...] that the recently sequenced genome of A. aeolicus
represents the earliest known stage of bacterial cupin
evolution, a suggestion supported by the fact that the family
[...] most deeply branching family within the
bacterial domain on the basis of phylogenetic analysis of 16S
rRNA sequences. Specifically, the A. aeolicus genome contains
[...] cupin genes. Between aq_[...]528 and aq_[...]29 there is a
cryptic cupin gene encoding a protein with a 15-aa spacing,
similar to gi|2128971 from Methanococcus jannaschii, and in
addition there are two other closely [...] cupin genes
[...] diverged in sequence by the addition of 12 bp (encoding
4 aa) to the intermotif region while retaining their relative physical
position in the chromosome. In contrast, [...]
B. subtilis (yjlB and ykrZ) are located more than 200 kb apart,
presumably as a consequence of [...].

In this context, there is also an interesting relationship
between the ydbB gene and its adjacent sequences in the B.
subtilis genome. A TBlastN analysis of this gene shows that its
closest relative from plants is the wheat protein germin, [...]
because of its high level of expression in germinating
embryos (172). Adjacent to ydbB is gsiB (for "glucose
starvation-inducible protein B") (gi|2632740), a sequence that has as
its closest neighbours (E values 2e-18 and 2e-15, respectively)
the plant Em protein (gi|1169315) from A[...]
and the late embryogenesis abundant (LEA) protein [...]
from barley (75, 243). It is possible that these
adjacent bacterial sequences have a common developmental
link that began with a role in stress response, was retained
throughout evolution, and is now associated with embryo
development in higher plants.



Comparison of Single-Domain and Two-Domain Cupins

Detailed analysis of various pairwise alignments of the
single-domain and two-domain cupins shows a number of interesting
features relating to the possible origin and evolution of
these proteins. First, as described above, single-domain proteins
are more similar to the C-terminal domain than to the
N-terminal domain of their respective two-domain relatives.
The reason for this disparity is unknown, but it is possible that
there is less selection pressure to conserve [...] the
N-terminal domain in a two-domain cupin. Another possible
explanation for such variation is suggested by
discoveries in the family of extradiol dioxygenases, where there are
both single- and double-domain members (74). Among some of
the [...] enzymes, there is evidence that the two domains
express different phylogenies, suggesting that
these particular enzymes arose from recombination or even
fusion of genes encoding different dioxygenases.

Cupins and the Comparative Structure of
Microbial Cell Walls
The very close similarity between the two-domain 20+20
bicupin (gi|1652630) from the gram-negative [...]
and the [...]
and Streptococcus mutans (Fig. 3) supports the [...]
[...] that the cyanobacteria constitute one of the deepest-branching
[...] within the gram-negative species and [...] a
close affinity to the gram-positive species. [...]
noted by Gupta (105) and based on sequences from [...]
[...] phosphoribosyl formyl glycinamidine
[...]
gram-positive bacteria, the cyanobacteria, and the archaea.
Additional circumstantial evidence for a close [...] between
some gram-positive bacteria and some archaea [...]
analysis of archaeal cupins. Whereas only two such sequences
have been identified in Methanococcus jannaschii, [...]
[...] have been found in Methanobacterium thermoautotrophicum
(Dunwell, unpublished). It may be that the additional
cupins in M. thermoautotrophicum are linked [...]
cell wall characteristics, since it was noted by Gupta (105) that
a number of archaea, including Methanobacterium, exhibited a
[...] homogeneous cell wall that shows
the Gram reaction. Perhaps the M. thermoautotrophicum [...]
[...]
that [...] more likely to be contiguous sections of a single
[...]
(to correct a frameshift) produces a modified [...]
encoding a polypeptide most similar (E value 3e-19) to the
[...]
[...] pathway
linked to the production of complex exopolysaccharides in cell
walls.

[...] in the late [...]
[...] (106) has used differences in cell wall architecture to argue
against the concept of considering the Archaea (307) to be a
separate kingdom and in favour of linking the gram-positive
bacteria and the archaea in a grouping of monoderm prokaryotes
(surrounded by a single membrane) distinct from the
diderm prokaryotes (i.e., all true gram-negative bacteria containing
both an inner cytoplasmic membrane and an outer
membrane). It is possible that the functional role of many
cupins in cell wall synthesis will help in the resolution of this
debate.

STRUCTURAL ASPECTS OF CUPINS

Despite the wealth of information on the primary sequence
of cupins and their conserved core motifs, relatively little is
known about their secondary, tertiary, and quaternary structure.
As described above, the major advance in this area came
from the discovery (20) of the global conservation of a small
number of residues in the plant storage proteins and germins
and the slime mold spherulins (29). This discovery enabled the
3D structure of the bean storage protein phaseolin (177) to be
used to generate a homology model (90) of the wheat GF-2.8
germin, an OXO, and to predict various quaternary structures
for the arrangement of subunits. Comprehensive physicochemical
studies (197) of the monomer and oligomer had suggested
a homopentameric assembly of subunits in native wheat germin,
but subsequently X-ray diffraction studies of barley germin
in the same laboratory (E. F. Pai and B. G. Lane, unpublished
[but cited in reference 90]) excluded a pentamer and
dictated a hexameric or tetrameric structure for the cereal
germins. The Mr (~25) of the glycosylated germin monomer,
based on its mobility in sodium dodecyl sulfate-polyacrylamide
gel electrophoresis gels, had been incorrect owing to a
glycan-induced anomaly. The correct Mr of the germin monomer,
based on the sizes of its polypeptide and N-glycan constituents
(20 + 2 = 22) (135), conforms with sedimentation-equilibrium
measurements of the Mr (~130) only if germin is homohexameric,
not homotetrameric. This conclusion was confirmed,
definitively, by a more comprehensive X-ray diffraction study
of barley OXO crystals in our own laboratory (310); this enzyme
contains a hexameric arrangement of subunits of the type
found in the storage proteins, but of course these latter proteins
are composed of a trimer of two-domain subunits rather
than being a trimer of single-domain dimers. The homology
model of OXO (90) also confirmed the potential catalytic
significance and metal-binding capacity of the three conserved
His residues located within motifs 1 and 2 at the center of the
β-barrel of many cupins. It is now considered likely that such a
His cluster, together with an adjacent conserved Glu, may be
the binding site for an Mn2+ ion recently found to be the metal
present in OXO, at least in those isolated from cereals (3[...],
238, 239). A similar combination of modelling and experimental
approaches could be made using the structural data from
the sugar-binding domain of the bacterial AraC transcription
factor (270) and the sequence data from SBP (bicupins) from
higher plants (40, 219) in order to identify ligands specifically
involved in the binding of either mono- or disaccharides in
these subgroups.
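
The subunit arithmetic behind the germin oligomer assignment can be checked with a short sketch. The mass values (20, 2, 25, and 130, in thousands) are those quoted in the text; the function name and the candidate list are hypothetical, and choosing the nearest predicted mass is a simplifying assumption, not the method used in the cited studies.

```python
def best_oligomer(monomer_mass, observed_mass, candidates=(4, 5, 6)):
    """Return the subunit count whose predicted mass is closest to the observed mass."""
    return min(candidates, key=lambda n: abs(n * monomer_mass - observed_mass))


polypeptide = 20                       # polypeptide contribution (from the text)
glycan = 2                             # N-glycan contribution (from the text)
corrected_monomer = polypeptide + glycan
observed_oligomer = 130                # sedimentation-equilibrium estimate (from the text)

for n in (4, 5, 6):
    # Predicted oligomer mass for tetramer, pentamer, hexamer.
    print(n, n * corrected_monomer)    # 4 88, 5 110, 6 132

print(best_oligomer(corrected_monomer, observed_oligomer))  # 6
print(best_oligomer(25, observed_oligomer))                 # 5
```

With the corrected monomer value of 22, only the hexamer (132) lies close to the measured 130. With the anomalous gel-based value of 25, a pentamer (125) would fit best, which is consistent with the earlier pentamer proposal described above.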

SUMMARY OF CUPIN FUNCTIONS

In conclusion, therefore, cupins are found in a wide range of
cell types and have a wide range of biochemical functions,
including several enzymes related to cell wall synthesis, particularly
in reactions involving sugar modification. There seems to
be a consistent association with stress responses in both prokaryotes
[...]
desiccation tolerance (20) is exemplified by the seed storage
proteins, a specialized and well-characterized group of nonenzymatic
proteins; they contain at most a single conserved His,
and no enzyme activity is known. One additional recurring
theme in this analysis of cupin function is the link to oxalate
metabolism, an area of biochemistry that has received little
attention recently. The section below will attempt to remedy
this omission by emphasizing the significance of oxalate (and
oxalate degradation) in several fields of microbiology, plant
science, food science, and medicine.

BIOLOGICAL SIGNIFICANCE OF CUPINS IN
OXALATE METABOLISM

There are two pieces of important evidence that suggest a
link between some of the cupin proteins and oxalate metabolism.
First, the archetypal member of the cupin family is wheat
germin, a cereal protein with OXO activity (173) (see above).
Second, and most significantly, the fungal OXDCs have been
found previously to be bicupin proteins (69). It is very likely,
therefore, that other cupins have similar enzymatic activities,
and in particular one would predict that the two 20+20 bicupins
from B. subtilis and the related sequence from Streptococcus
mutans identified in this review (Fig. 3) are also OXDCs.
Unfortunately, there is scant information about the role of
oxalate metabolism in these bacterial species. It is known,
however, that oxalic acid is among the range of organic acids
produced by strains of B. subtilis isolated from certain Indian
soils (17).

Microbiological Significance of Oxalic Acid and
Oxalate-Degrading Enzymes

Oxalic acid has been implicated in a wide range of environmental
effects including several biological and geochemical
processes in soils (71, 101). For example, oxalate is the major
organic anion in many forest soils throughout the world, and as
such it has a large effect on the availability of phosphorus,
aluminum, and calcium (154). This is because oxalate will
chelate aluminum and iron, thereby making more phosphorus
available to plant roots. Interestingly, aluminum has recently
been found to stimulate the production and secretion of oxalate
from roots of buckwheat (188, 323) in addition to inducing
the synthesis of a defence-related GLP in wheat. As well as
this role in plant nutrition, oxalic acid is linked to general
weathering of soil minerals and the subsequent precipitation of
insoluble metal oxalates (87). This latter process is associated
with the survival of fungi growing in the presence of potentially
toxic metal compounds (e.g., copper-containing wood preservatives).
It has also been exploited in the solubilization of
heavy metals from bauxite, clay, sand, sewage sludge, and other
metal-bearing materials. In many of these applications,
Aspergillus niger is favored as the best species for oxalic acid
production (251, 252, 279), in contrast to the medical dangers
of such production by this organism.

A related environmental role for oxalic acid (and oxalate-degrading
enzymes) comes from its key importance in the
carbon cycle and the release of CO2 from rotting wood (98,
99), a process largely mediated by basidiomycetous white rot
fungi. Such fungi, along with their brown rot equivalents (203),
are known to produce oxalic acid (72, 257) and oxalate-degrading
enzymes (73, 202) under some conditions. The biochemical
[...] for the role of oxalate include its capacity to chelate
[...] and thus to stimulate the activity of Mn peroxidase,
[...]
[...] of the fungus Bjerkandera [...] in the
[...] kraft pulp during paper making (201, 208, 236).
[...] of this type, the level of oxalate, and thus the
[...] particular technology, depends upon the
[...] OXDC (or equivalent) [...]
[...] of the putative structure of OXDC, albeit
[...] relationship to OXO and seed [...],
provides the first opportunity to consider directed mod[...]

[...] emphasize the key role of manganese in
[...] reactions utilized in both lignin [...] (Mn
[...]
[...] of the carbon cycle, which involves both the
[...] of carbon in woody biomass and its later release from
[...] material.

Role of Oxalate in Plant Pathogenesis

As described below in the context of transgenic plants, oxalic
acid is associated with many plant pathogens, particularly
Sclerotinia sclerotiorum (185) and related species. [...]
[...] into the plant during the infection process is implicated
in the degradation of the plant cells, and is then often
[...] by the pathogen or precipitated in the form of calcium
oxalate crystals (78, 315).

COMMERCIAL SIGNIFICANCE OF OXALATE-
DEGRADING ENZYMES

[...] (PCT patent application WO 92/15685) [...], many
[...] actual or potential commercial
significance, with applications in many areas of [...]
[...]
[...] superfamily will have practical benefits.



Medical Diagnosis and Treatment

As described above, the OXO [...]
[...]
(known as aspergillosis) (169, 226). [...]
in cystic fibrosis patients as a [...]
[...] that eliminates [...]
[...]
treatments to reduce metabolic [...]
[...]
Other suggested means [...]
use of immobilized or encapsulated (18, [...]) [...]
enzymes in the digestive tract or in the peritoneal [...].
[...] OXO which had been immobilized by adsorption
[...]
metal inactivation and was considered suitable for [...]
(231). In a related study (230), rats implanted in their
peritoneal cavity with dialysis membranes containing banana
OXO were able to metabolize intraperitoneally injected
[...], as well as its precursor, [...]. The leading
therapeutic compound in preclinical development is probably
[...]-62/47 (from Ixion Biotechnology Inc.), an orally ad[...]
[...]
bacterial origin (M. [...], 26 [...]
1998, PCT patent application WO 98/52580).

An alternative to treatment of patients with these conditions
[...] is to reduce the oxalate content of food prior
to being eaten ([...]
[...] March 1994, European patent application 063932[...]). In
one [...] method, [...]
[...] juice or the oxalate in an infusion of
[...] (one of the major sources of oxalate in the human
[...]) [...] a stirred-tank [...]
[...] patent application). Transgenic [...] reducing oxalate
levels in plants will be considered below.

In addition to this relatively common metabolic disorder,
[...] serious genetic
[...]
[...] treatment of this condition is also discussed below.
Human Gene Therapy

[...] of gene therapy for treatment of these [...]
[...]. Although their eventual aim was to clone and utilize the
[...]
[...]
first achieved in the cloning of the bacterial equivalent
(187), the oxalyl coenzyme A decarboxylase from Oxalobacter
formigenes, an anaerobic oxalate-degrading bacterium found in
the mammalian intestine (4, 15, 165). [...]
[...] described
in this review will soon lead to further advances in
[...] important area of human therapy.



Transgenic Plants
Resistance to plant pathogens. Possibly the most extensive
[...]
basis of this strategy is the toxic effect of the oxalic acid secreted
by many such pathogens, particularly Sclerotinia sclerotiorum
[...]
losses can reach 60%. At present, control of the disease by
fungicide application to the plant is expensive and not always
reliable. The overwintering sclerotia can remain dormant in
the soil for several years before germinating and either producing
apothecia (fruiting bodies which generate large numbers
of the infective ascospores) or growing directly into mycelia,
which can invade the plant stem at ground level. Few
genetic sources of resistance to the pathogen are available to
the plant breeder, and despite the potential of mycoparasites
as a means of biological control (306), there is a growing
demand for new approaches to combat this disease (246).

It has been known for several years that the mode of
action of this pathogen involves an important role for oxalic
acid (215, 244, 325). This acid is secreted by the fungal
mycelium, which either develops on the petal and leaf surfaces
(138) after germination of the ascospore or grows out
directly from the perennating sclerotium after its germination
in the soil. As the pathogen-derived acid enters the
plant, it chelates the calcium from the middle lamellae of
the plant cell walls, thus inducing embolisms in the xylem
vessels and causing wilting (272). The acid also inhibits the
activity of o-diphenol oxidase in host cells (76),
thereby suppressing defense responses. In addition, it reduces
the internal pH of the plant and consequently stimulates
the activity of the cell wall-degrading cellulases and
pectinases produced by the pathogen (192). Genetic evidence
for the role of oxalic acid comes from studies on
mutant strains of the fungus which are deficient in oxalate
production and also avirulent; revertants regain their virulence
(95).

A strategy was therefore developed to introduce into susceptible
plants a gene encoding an oxalate-degrading enzyme
which could reduce the level of the pathogen-derived toxin and
thereby reduce the growth of the infective mycelium and
spread of the disease. The best results achieved to date (287,
288; D. L. Bidney, D. G. Charne, S. L. Coughlan, I. Falak,
M. K. Mancl, K. A. Nazarian, C. J. Scelonge, and N. Yalpani,
23 January 1999, PCT patent application; C. Thompson, [...]
Nisbet, H. Jones, and J. M. Dunwell, unpublished observations)
refer to investigations with oilseed rape and sunflower
transformed with the barley or wheat OXO. Such results have
considerable commercial potential (82, 111). Additionally, it
has been proposed that oxalate-degrading genes could be used
as selectable markers in transformation experiments ([...]).

Improvements in digestibility. Oxalic acid and its salts are
well known to be toxic to humans at high doses (104, 183) and
can cause medical problems even when present at low concentrations
in the diet. Certain leafy vegetables such as spinach
and Amaranthus have particularly high concentrations, and
there are many breeding efforts to improve the palatability of
these and other crops for humans, and also for farm animals
such as sheep and goats, where aversion to fodder has been
associated with oxalate content (83, 167).

In addition to a general reduction in levels of soluble or
insoluble oxalate, it is possible that introduction of oxalate-degrading
genes (199) may be useful as a means of reducing
specific oxalate-related toxins such as the [...] neurotoxin
β-N-oxalyl-L-α,β-diaminopropanoic acid (ODAP), a
compound that is found in the legume Lathyrus [...]
and is associated with the development of lathyrism (233, [...]),
a severe neurological disease.

[...] contaminant in industrial [...]
[...] of alumina by the Bayer process. At present, the disposal
methods comprise either burning or burial in landfill sites. It is
hoped that these methods may be replaced in future by use of
various means of bacterial degradation involving either an
alkalophilic Bacillus species (209) or Pseudomonas oxalaticus
(11), an organism isolated from rhubarb patches (a rich source
of oxalate). It is possible that the recently described new species
of obligately oxalotrophic Ammoniphilus, A. oxalaticus and A.
oxalivorans (319), isolated from the rhizosphere of sorrel
(Rumex acetosa, another species with a high oxalate content),
could be used for these purposes.

A recent application in this area is the claimed use (N. O.
Nilvebrandt, A. Reiman, and F. De Sousa, 26 February 1998,
PCT patent application) of oxalate-degrading enzymes (OXO
and/or OXDC) to reduce the levels of oxalate in the process
liquids during the production of pulp and paper. Wood contains
oxalic acid at concentrations of 0.1 to 0.4 kg/ton (bark
contains up to 15 kg/ton), and during processing this compound
can easily precipitate in the form of calcium oxalate
crystals, which cause problems in pipework, washing filters, and
heat exchangers. A similar previous application (117) involved
the use of these enzymes to inhibit the deposition of beer
stones during the brewing process. Levels of oxalate also must
be kept low to prevent the problem of "gushing" (see also
reference 107) during this process.

Another potential industrial use of OXDC and related enzymes
concerns the use of fungal exopolysaccharides. One such
compound is scleroglucan, a neutral glucan produced by
Sclerotium glucanicum and composed of a linear chain of β-D-(1,3)-linked
D-glucopyranosyl residues with single D-glucopyranosyl
residues linked β(1,6) to every third residue of the main chain.
Because of its structure and high molecular weight, it has great
value as a viscosifier in enhanced oil recovery, a role in which its
closest rival is xanthan gum (122). Unfortunately, production
of scleroglucan in reactors is accompanied by the concurrent
synthesis of oxalate (302), an unwanted by-product that makes
purification more costly. Such synthesis is stimulated at a pH
above 3.5, partly because of inhibition of OXDC. It would
therefore be most valuable to be able to modify the characteristics
of this enzyme in order to reduce the level of oxalate
formed during the production process.

OXALATE AND THE ORIGIN OF LIFE

In a recent review (112) of photosynthesis and the origin of
life, it was suggested that the first stage in the development of
photosynthesis took place in iron-rich clays and involved the
photoreduction of carbon dioxide to form oxalate, which was
then reduced to glyoxylate with the aid of manganese (see the
comments on Mn2+ as the active-site metal in barley OXO
above). This phase was then followed by the entry of sulfur into
the evolving clay systems, the subsequent ability to fix nitrogen,
and finally the involvement of phosphate. Oxalate is therefore
implicated in the primary production of all organic chemicals
on Earth.

ORIGINAL FUNCTION OF THE ANCESTRAL 
"PROTOCUPIN" 

It is possible that the ancestral "protocupin" was a small 
protein (ca. 100 residues with an intermotif spacing of 13 aa) 
that was very heat stable (cf. proteins gi|2984227 from Aquifex 
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,r example as a PMI awe lD t hcreby a sugar-binding 
was also involved » *f*»*?££^Xl ^NA-binding do- 
3 rm of the cupin domain ; cornbur etfj*£ » uansC ription 
U to generate the «^^J^S^^^ 
actor, In which the sugar-binding element i> t . t B 

Levant to note the dose asonm diversification of 

ira bir^e-richh e rruceUuloses U^H^™ durinp 
Action of the original _cup.ni ^ng^ - n ^ ^terrnotif 
^sequent evolution, * H*^£jKT,t both ends of the 

S 8 ion q (Fig. $!^«^$FS?£^ of a r lu - 

protcin (probably „r fusion of the entire sc- 

rneric structure), and ^P l f£°° or rf fte ^ 

quence to produce bicupin be'assoeiated with 
otic cupin genes have intfons, "^ J c evolutiona ry pro- 
thc occurrence of underlying cupin structure 

cess. Such variation nw<« «^ tn £ ^ ^ the extremo- 
i5 „cw found tof^.^ 'Jfi heat-stable, hexarnenc. 
philc proteins of ^™*Z cereaU (201 aa in each sub- 
rnanganese-contaiomg P^^^Tttimeric seed storage 
unit 1310]) and the ^^^£SS^) 
Jrotrins (400 to 500 aa f^^Z^^g of the 
plants. It is hoped hat ar ' ™P">££ ^ erse superfamdy of 
jU^c-function ^^K^ factors that allowed 
proteins will She evolutionary process and 

noted that the sizes of the two jxm* fa ^ 2) 

"pin signature (20 or of the conserved 

correspond ^i^^n^uS to n»lte np all ancestral 
units (T 5 , 22 and » ->g» ^ *° n d to be encoded by the 
proteins in the progenote (56, 5 /, **) 
ancestral exons. 

CONCLUDING REMARKS AND FUTURE DIRECTIONS 

The discovery of the ■£ S^SS^ 5 
of an integrated abroad to the ^g^on of infer- 
function (24, 38, J 19, ^ °Sg X-ray analysis of storage 
mation available from the : P«*£»» * genom e 

proteins (155, 177) Tg*gJ^JUm* 
programs was it possiMc to , detetf Ma prote ,ns 
Ltwcen this ^dc 5 PC«rum oUnzy ^ h Qnly a ^ 
a „d thereby to reinforce the * ^^d element from 
lively small number of ««*^ ^ Th is new form ol 

ihich all proteins « j-gj fcoS be debated protem 
biological research. wb.ch perbap^ q£ ^ lD g 

modular biology, is a JPj**^, t0 compare primary 
power of analytical *0^ s Snu data. In this particu- 
scauence information with 313 discovery 
5S»le of multifunctional have un- 

SS^M** 1 * AraVmd S particularly stable 
B-sheet structure has been <»°P<" ble (22 2, 223). pepsin- 
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different metals, a situation reminiscent of that found in the 
homoprotocatechuate 2,3-dioxygenases 

Future studies w.ll take a «™ 1 a rfami , y 
iMtenainly^ 

the functional ^^^^Sn for the variety of 
^ allow rational pr« proCesS will 

related proteins in cuVaryotes. uuu /ch p tec hnol- 

oe aided by the ''^^^SSSSp^ ?f all 
ogy (61) that is able to mumteO* -V » ^ 0fC cnviro0 . 

^within a single P^^^^iy the role of the 
ten* In particular, it is ^ f XlsSe exernplined. As 
large number of GLPs « Jgf apopblS tic and arc 

described above, many of ^gJ^J^ siress. They are 
assodatedv^threspoiiswtobiotieanaa^ it „ aatgA m 

therefore of P^^'^JlKmSng environments. In 
hnpwi«g«l ? I^^ , SalnS3 additional cupin 
a more fundamental r ^ UI % 3J0 ) (Fig. 2) w.ll 

structures at the ^ active-site residues 

confirm the exact significance of meny extremes of 

fn^ose that confer ra ^XSMestructive ehem- 
tcmperature and m the pre jSpotential value, for 
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